(54) Production and use of normalized DNA libraries



(57) Disclosed is a process for forming a normalized
genomic DNA library from an environmental sample by
(a) isolating a genomic DNA population from the
environmental sample; (b) analyzing the complexity of the
genomic DNA population so isolated; (c) at least one of
(i) amplifying the copy number of the DNA population so
isolated and (ii) recovering a fraction of the isolated
genomic DNA having a desired characteristic; and (d)
normalizing the representation of various DNAs within the
genomic DNA population so as to form a normalized
library of genomic DNA from the environmental sample.
Also disclosed is a normalized genomic DNA library 
formed from an environmental sample by the process. 
Description

[0001] The present invention relates to the field of production and screening of gene libraries, and more particularly
to the generation and screening of normalized genomic DNA libraries from mixed populations of microbes and/or other 
organisms. 

BACKGROUND OF THE INVENTION 

[0002] There has been increasing demand in the research reagent, diagnostic reagent and chemical process industries
for protein-based catalysts possessing novel capabilities. At present, this need is largely addressed using enzymes
purified from a variety of cultivated bacteria or fungi. However, because less than 1% of naturally occurring microbes 
can be grown in pure culture (Amann, 1995), alternative techniques must be developed to exploit the full breadth of 
microbial diversity for potentially valuable new products. 

[0003] Virtually all of the commercial enzymes now in use have come from cultured organisms. Most of these organisms
are bacteria or fungi. Amann et al. (Amann, 1995) have estimated the fraction of cultivable microorganisms in the
environment as follows:



Habitat                       Culturability (%)
Seawater                      0.001-0.1
Freshwater                    0.25
Mesotrophic lake              0.01-1.0
Unpolluted estuarine waters   0.1-3.0
Activated sludge              1.0-15.0
Sediments                     0.25
Soil                          0.3



[0004] These data were determined from published information regarding the number of cultivated microorganisms 
derived from the various habitats indicated. 

[0005] Other studies have also demonstrated that cultivated organisms comprise only a small fraction of the biomass 
present in the environment. For example, one group of workers recently reported the collection of water and sediment 
samples from the "Obsidian Pool" in Yellowstone National Park (Barns, 1994) where they found cells hybridizing to 
archaea-specific probes in 55% of 75 enrichment cultures. Amplification and cloning of 16S rRNA-encoding sequences
revealed mostly unique sequences with little or no representation of the organisms which had previously been cultured
from this pool, suggesting the existence of substantial diversity of archaea with so far unknown morphological,
physiological and biochemical features. Another group performed similar studies on the cyanobacterial mat of Octopus
Spring in Yellowstone Park and came to the same conclusion; namely, tremendous uncultured diversity exists (Ward,
1990). Giovannoni et al. (1990) and Torsvik et al. (1990a) have reported similar results using bacterioplankton collected
in the Sargasso Sea and in soil samples, respectively. These results indicate that the exclusive use of cultured organ- 
isms in screening for useful enzymatic or other bioactivities severely limits the sampling of the potential diversity in 
existence. 

[0006] Screening of gene libraries from cultured samples has already proven valuable. It has recently been made 
clear, however, that the use of only cultured organisms for library generation limits access to the diversity of nature. 
The uncultivated organisms present in the environment, and/or enzymes or other bioactivities derived thereof, may be 
useful in industrial processes. The cultivation of each organism represented in any given environmental sample would 
require significant time and effort. It has been estimated that in a rich sample of soil, more than 10,000 different species
can be present. It is apparent that attempting to individually cultivate each of these species would be a cumbersome 
task. Therefore, novel methods of efficiently accessing the diversity present in the environment are highly desirable. 

SUMMARY OF THE INVENTION

[0007] The present invention addresses this need by providing methods to isolate the DNA from a variety of sources, 
including isolated organisms, consortia of microorganisms, primary enrichments, and environmental samples, to make
libraries which have been "normalized" in their representation of the genome populations in the original samples, and 
to screen these libraries for enzyme and other bioactivities. 

[0008] The present invention represents a novel, recombinant approach to generate and screen DNA libraries
constructed from mixed microbial populations of cultivated or, preferably, uncultivated (or "environmental") samples. In
accordance with the present invention, libraries with equivalent representation of genomes from microbes that can
differ vastly in abundance in natural populations are generated and screened. This "normalization" approach reduces
the redundancy of clones from abundant species and increases the representation of clones from rare species. These
normalized libraries allow for greater screening efficiency resulting in the isolation of genes encoding novel biological
catalysts.

[0009] Screening of mixed populations of organisms has been made a rational approach because of the availability
of techniques described herein, whereas previously attempts at screening of mixed populations were not feasible and
were avoided because of the cumbersome procedures required.

[0010] Thus, in one aspect the invention provides a process for forming a normalized genomic DNA library from an
environmental sample by (a) isolating a genomic DNA population from the environmental sample; (b) analyzing the
complexity of the genomic DNA population so isolated; (c) at least one of (i) amplifying the copy number of the DNA
population so isolated and (ii) recovering a fraction of the isolated genomic DNA having a desired characteristic; and
(d) normalizing the representation of various DNAs within the genomic DNA population so as to form a normalized
library of genomic DNA from the environmental sample.

[0011] In one preferred embodiment of this aspect, the process comprises the step of recovering a fraction of the
isolated genomic DNA having a desired characteristic.

[0012] In another preferred embodiment of this aspect, the process comprises the step of amplifying the copy number
of the DNA population so isolated.

[0013] In another preferred embodiment of this aspect, the step of amplifying the genomic DNA precedes the
normalizing step. In an alternate preferred embodiment of this aspect, the step of normalizing the genomic DNA precedes
the amplifying step.

[0014] In another preferred embodiment of this aspect, the process comprises both the steps of (i) amplifying the 
copy number of the DNA population so isolated and (ii) recovering a fraction of the isolated genomic DNA having a 
desired characteristic. 

[0015] Another aspect of the invention provides a normalized genomic DNA library formed from an environmental
sample by a process comprising the steps of (a) isolating a genomic DNA population from the environmental
sample; (b) analyzing the complexity of the genomic DNA population so isolated; (c) at least one of (i) amplifying the
copy number of the DNA population so isolated and (ii) recovering a fraction of the isolated genomic DNA having a
desired characteristic; and (d) normalizing the representation of various DNAs within the genomic DNA population so
as to form a normalized library of genomic DNA from the environmental sample. The various preferred embodiments
described with respect to the above method aspect of the invention are likewise applicable with regard to this aspect
of the invention.

[0016] The invention also provides a process for forming a normalized genomic DNA library from an environmental
sample by (a) isolating a genomic DNA population from the environmental sample; (b) analyzing the complexity of the
genomic DNA population so isolated; (c) at least one of (i) amplifying the copy number of the DNA population so isolated
and (ii) recovering a fraction of the isolated genomic DNA having a desired characteristic; and (d) normalizing the
representation of various DNAs within the genomic DNA population so as to form a normalized library of genomic DNA
from the environmental sample.

Another aspect of the invention provides a normalized genomic DNA library formed from an environmental sample
by a process comprising the steps of (a) isolating a genomic DNA population from the environmental sample; (b)
analyzing the complexity of the genomic DNA population so isolated; (c) at least one of (i) amplifying the copy number
of the DNA population so isolated and (ii) recovering a fraction of the isolated genomic DNA having a desired
characteristic; and (d) normalizing the representation of various DNAs within the genomic DNA population so as to form a
normalized library of genomic DNA from the environmental sample. The various preferred embodiments described
with respect to the above method aspect of the invention are likewise applicable with regard to this aspect of the
invention.

BRIEF DESCRIPTION OF THE DRAWING

[0017] Figure 1 is a graph showing the percent of total DNA content represented by G + C in the various genomic
DNA isolates tested as described in Example 2. 

DETAILED DESCRIPTION OF THE INVENTION 

DNA ISOLATION:

[0018] An important step in the generation of a normalized DNA library from an environmental sample is the
preparation of nucleic acid from the sample. DNA can be isolated from samples using various techniques well known in the
art (Nucleic Acids in the Environment: Methods & Applications, J.T. Trevors, D.D. van Elsas, Springer Laboratory, 1995).
Preferably, DNA obtained will be of large size and free of enzyme inhibitors and other contaminants. DNA can be
isolated directly from the environmental sample (direct lysis) or cells may be harvested from the sample prior to DNA
recovery (cell separation). Direct lysis procedures have several advantages over protocols based on cell separation.
The direct lysis technique provides more DNA with a generally higher representation of the microbial community;
however, it is sometimes smaller in size and more likely to contain enzyme inhibitors than DNA recovered using the cell
separation technique. Very useful direct lysis techniques have recently been described which provide DNA of high
molecular weight and high purity (Barns, 1994; Holben, 1994). If inhibitors are present, there are several protocols
which utilize cell isolation which can be employed (Holben, 1994). Additionally, a fractionation technique, such as the
bis-benzimide separation (cesium chloride isolation) described below, can be used to enhance the purity of the DNA.

ANALYSIS OF COMPLEXITY: 

[0019] Analysis of the complexity of the nucleic acid recovered from the environmental samples can be important to
monitor during the isolation and normalization processes. 16S rRNA analysis is one technique that can be used to
analyze the complexity of the DNA recovered from environmental samples (Reysenbach, 1992; DeLong, 1992; Barns, 
1994). Primers have been described for the specific amplification of 16S rRNA genes from each of the three described 
domains. 
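
As an illustration of how such a complexity analysis might be summarized once 16S clone sequences are in hand, the following minimal sketch counts distinct sequences and computes a Shannon diversity index. The function name `complexity_summary` and the toy sequences are hypothetical, and treating only exact matches as the same sequence is a simplifying assumption; it is not a procedure stated in this description.

```python
import math
from collections import Counter


def complexity_summary(sequences):
    """Return (richness, shannon_index) for a list of 16S clone sequences.

    Assumption: two clones count as the same sequence type only if they
    match exactly (case-insensitive). Real analyses usually cluster by
    similarity instead.
    """
    counts = Counter(seq.upper() for seq in sequences)
    total = sum(counts.values())
    richness = len(counts)  # number of distinct sequence types
    # Shannon index: -sum(p * ln p) over the observed sequence types
    shannon = -sum((n / total) * math.log(n / total) for n in counts.values())
    return richness, shannon


# Toy data: three copies of one type, two of another, one of a third
clones = ["ACGT", "ACGT", "ACGT", "TTGA", "GGCC", "ggcc"]
richness, shannon = complexity_summary(clones)
print(richness, round(shannon, 3))
```

A higher index for the same number of clones suggests a more even mixture, which is one way to track whether isolation or normalization steps are changing the apparent complexity.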

FRACTIONATION:

[0020] Fractionation of the DNA samples prior to normalization increases the chances of cloning DNA from minor
species from the pool of organisms sampled. In the present invention, DNA is preferably fractionated using a density
centrifugation technique. One example of such a technique is a cesium-chloride gradient. Preferably, the technique is
performed in the presence of a nucleic acid intercalating agent which will bind regions of the DNA and cause a change
in the buoyant density of the nucleic acid. More preferably, the nucleic acid intercalating agent is a dye, such as
bis-benzimide, which will preferentially bind regions of DNA (AT in the case of bis-benzimide) (Muller, 1975; Manuelidis,
1977). When nucleic acid complexed with an intercalating agent, such as bis-benzimide, is separated in an appropriate
cesium-chloride gradient, the nucleic acid is fractionated. If the intercalating agent preferentially binds regions of the
DNA, such as GC or AT regions, the nucleic acid is separated based on relative base content in the DNA. Nucleic acid
from multiple organisms can be separated in this manner.
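
Because the separation described above depends on relative base content, a short sketch can show how genomes might be ranked by G+C fraction to anticipate their relative order along such a gradient. The function names and the toy sequences are hypothetical, and the sketch only ranks by base composition; it does not model buoyant density or dye binding.

```python
def gc_fraction(seq):
    """Fraction of G+C among the unambiguous A/C/G/T bases in seq."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return gc / acgt


def order_by_base_content(genomes):
    """Sort (name, sequence) pairs from most AT-rich to most GC-rich."""
    return sorted(genomes, key=lambda item: gc_fraction(item[1]))


# Made-up sequences standing in for three genomes of differing G+C content
toy_genomes = [
    ("high_gc", "GGCGCCGCGTACGCGC"),
    ("low_gc", "ATATTAAGTTATACAT"),
    ("mid_gc", "ATGCATGCATGCATGC"),
]
ranked = [name for name, _ in order_by_base_content(toy_genomes)]
print(ranked)
```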

[0021] Density gradients are currently employed to fractionate nucleic acids. For example, the use of bis-benzimide
density gradients for the separation of microbial nucleic acids for use in soil typing and bioremediation has been
described. In these experiments, one evaluates the relative abundance of A260 peaks within fixed benzimide gradients
before and after remediation treatment to see how the bacterial populations have been affected. The technique relies
on the premise that on the average, the GC content of a species is relatively consistent. This technique is applied in
the present invention to fractionate complex mixtures of genomes. The nucleic acids derived from a sample are
subjected to ultracentrifugation and fractionated while measuring the A260 as in the published procedures.

[0022] In one aspect of the present invention, equal A260 units are removed from each peak, the nucleic acid is
amplified using a variety of amplification protocols known in the art, including those described hereafter, and gene
libraries are prepared. Alternatively, equal A260 units are removed from each peak, and gene libraries are prepared
directly from this nucleic acid. Thus, gene libraries are prepared from a combination of equal amounts of DNA from
each peak. This strategy enables access to genes from minority organisms within environmental samples and
enrichments, whose genomes may not be represented or may even be lost, due to the fact that the organisms are present
in such minor quantity, if a library was constructed from the total unfractionated DNA sample. Alternatively, DNA can be
normalized subsequent to fractionation, using techniques described hereafter. DNA libraries can then be generated
from this fractionated/normalized DNA.
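
The arithmetic behind removing equal A260 units from each peak can be sketched as follows. The sketch assumes that the A260 units in a fraction equal its absorbance reading multiplied by its volume in millilitres, and that the smallest peak sets the amount taken from every peak; the function name, peak names and readings are hypothetical.

```python
def pooling_volumes(peaks, target_units=None):
    """Volume (ml) to withdraw from each peak so all contribute equal A260 units.

    peaks: dict mapping peak name -> (a260_reading, volume_ml).
    target_units: A260 units to take per peak; defaults to the amount
    available in the smallest peak (an assumption of this sketch).
    """
    for name, (a260, _vol) in peaks.items():
        if a260 <= 0:
            raise ValueError(f"peak {name!r} has no measurable A260")
    available = {name: a260 * vol for name, (a260, vol) in peaks.items()}
    smallest = min(available.values())
    if target_units is None:
        target_units = smallest
    if target_units > smallest:
        raise ValueError("target exceeds the A260 units in the smallest peak")
    return {name: target_units / a260 for name, (a260, _vol) in peaks.items()}


# Hypothetical readings for three peaks collected from a gradient
peaks = {"peak_1": (2.0, 1.0), "peak_2": (0.5, 1.0), "peak_3": (0.25, 2.0)}
volumes = pooling_volumes(peaks)
print(volumes)
```

In this toy case the dominant peak contributes only a quarter of its volume while the two minor peaks are used in full, which is the sense in which pooling by equal units raises the share of minority genomes.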

[0023] The composition of multiple fractions of the fractionated nucleic acid can be determined using PCR related 
amplification methods of classification well known in the art. 


NORMALIZATION: 

[0024] Previous normalization protocols have been designed for constructing normalized cDNA libraries (WO
95/08647, WO 95/11986). These protocols were originally developed for the cloning and isolation of rare cDNAs derived
from mRNA. The present invention relates to the generation of normalized genomic DNA gene libraries from uncultured
or environmental samples.

[0025] Nucleic acid samples isolated directly from environmental samples or from primary enrichment cultures will
typically contain genomes from a large number of microorganisms. These complex communities of organisms can be
described by the absolute number of species present within a population and by the relative abundance of each
organism within the sample. Total normalization of each organism within a sample is very difficult to achieve. Separation
techniques such as optical tweezers can be used to pick morphologically distinct members within a sample. Cells from
each member can then be combined in equal numbers, or pure cultures of each member within a sample can be
prepared and equal numbers of cells from each pure culture combined to achieve normalization. In practice, this is
very difficult to perform, especially in a high-throughput manner.

[0026] The present invention involves the use of techniques to approach normalization of the genomes present within 
an environmental sample, generating a DNA library from the normalized nucleic acid, and screening the library for an 
activity of interest.

[0027] In one aspect of the present invention, DNA is isolated from the sample and fractionated. The strands of
nucleic acid are then melted and allowed to selectively reanneal under fixed conditions (C0t-driven hybridization).
Alternatively, DNA is not fractionated prior to this melting process. When a mixture of nucleic acid fragments is melted
and allowed to reanneal under stringent conditions, the common sequences find their complementary strands faster
than the rare sequences. After an optional single-stranded nucleic acid isolation step, single-stranded nucleic acid,
representing an enrichment of rare sequences, is amplified and used to generate gene libraries. This procedure leads
to the amplification of rare or low abundance nucleic acid molecules. These molecules are then used to generate a
library. While all DNA will be recovered, the identification of the organism originally containing the DNA may be lost.
This method offers the ability to recover DNA from "unclonable sources."
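
The reason common sequences reanneal faster than rare ones can be illustrated with the idealized second-order reassociation model, in which the amount of a sequence still single-stranded after time t is c0 / (1 + k * c0 * t). The sketch below assumes that model, a single shared rate constant, no cross-hybridization, and arbitrary units; the function name and the starting abundances are hypothetical and are not taken from this description.

```python
def single_stranded_remaining(initial_conc, rate_constant, time):
    """Amount of each sequence still single-stranded after `time`.

    Assumes ideal second-order kinetics: c(t) = c0 / (1 + k * c0 * t),
    with the same rate constant k for every sequence (a simplification).
    """
    return {
        name: c0 / (1.0 + rate_constant * c0 * time)
        for name, c0 in initial_conc.items()
    }


# Hypothetical starting abundances spanning four orders of magnitude
start = {"abundant": 1000.0, "common": 10.0, "rare": 0.1}

for t in (0.0, 0.1, 1.0, 10.0):
    remaining = single_stranded_remaining(start, rate_constant=1.0, time=t)
    ratio = remaining["abundant"] / remaining["rare"]
    print(f"t={t:>5}: abundant/rare in single-stranded pool = {ratio:.1f}")
```

Under these assumptions the single-stranded pool converges toward equal representation as reannealing proceeds, since every term approaches 1 / (k * t) regardless of its starting abundance.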

[0028] Nucleic acid samples derived using the previously described technique are amplified to complete the
normalization process. For example, samples can be amplified using PCR amplification protocols such as those described
by Ko et al. (Ko, 1990b; Ko, 1990a; Takahashi, 1994), or more preferably, long PCR protocols such as those described
by Barnes (1994) or Cheng (1994).

[0029] Normalization can be performed directly, or steps can also be taken to reduce the complexity of the nucleic
acid pools prior to the normalization process. Such reduction in complexity can be beneficial in recovering nucleic acid
from the poorly represented organisms.

[0030] The microorganisms from which the libraries may be prepared include prokaryotic microorganisms, such as
Eubacteria and Archaebacteria, and lower eukaryotic microorganisms such as fungi, some algae and protozoa. The
microorganisms may be cultured microorganisms or uncultured microorganisms obtained from environmental samples
and such microorganisms may be extremophiles, such as thermophiles, hyperthermophiles, psychrophiles,
psychrotrophs, etc.

[0031] As indicated above, the library may be produced from environmental samples in which case DNA may be
recovered without culturing of an organism or the DNA may be recovered from a cultured organism.

[0032] Sources of microorganism DNA as a starting material library from which target DNA is obtained are particularly
contemplated to include environmental samples, such as microbial samples obtained from Arctic and Antarctic ice,
water or permafrost sources, materials of volcanic origin, materials from soil or plant sources in tropical areas, etc.
Thus, for example, genomic DNA may be recovered from either a culturable or non-culturable organism and employed
to produce an appropriate recombinant expression library for subsequent determination of enzyme activity.

[0033] Bacteria and many eukaryotes have a coordinated mechanism for regulating genes whose products are
involved in related processes. The genes are clustered, in structures referred to as "gene clusters," on a single
chromosome and are transcribed together under the control of a single regulatory sequence, including a single promoter
which initiates transcription of the entire cluster. The gene cluster, the promoter, and additional sequences that function in
regulation altogether are referred to as an "operon" and can include up to 20 or more genes, usually from 2 to 6 genes.
Thus, a gene cluster is a group of adjacent genes that are either identical or related, usually as to their function.

[0034] Some gene families consist of identical members. Clustering is a prerequisite for maintaining identity between
genes, although clustered genes are not necessarily identical. Gene clusters range from extremes where a duplication
is generated to adjacent related genes to cases where hundreds of identical genes lie in a tandem array. Sometimes
no significance is discernible in a repetition of a particular gene. A principal example of this is the expressed duplicate
insulin genes in some species, whereas a single insulin gene is adequate in other mammalian species.

[0035] It is important to further research gene clusters and the extent to which the full length of the cluster is necessary
for the expression of the proteins resulting therefrom. Further, gene clusters undergo continual reorganization and,
thus, the ability to create heterogeneous libraries of gene clusters from, for example, bacterial or other prokaryote
sources is valuable in determining sources of novel proteins, particularly including enzymes such as, for example, the
polyketide synthases that are responsible for the synthesis of polyketides having a vast array of useful activities. Other
types of proteins that are the product(s) of gene clusters are also contemplated, including, for example, antibiotics,
antivirals, antitumor agents and regulatory proteins, such as insulin.

[0036] Polyketides are molecules which are an extremely rich source of bioactivities, including antibiotics (such as
tetracyclines and erythromycin), anti-cancer agents (daunomycin), immunosuppressants (FK506 and rapamycin), and
veterinary products (monensin). Many polyketides (produced by polyketide synthases) are valuable as therapeutic
agents. Polyketide synthases are multifunctional enzymes that catalyze the biosynthesis of a huge variety of carbon
chains differing in length and patterns of functionality and cyclization. Polyketide synthase genes fall into gene clusters
and at least one type (designated type I) of polyketide synthases have large size genes and enzymes, complicating
genetic manipulation and in vitro studies of these genes/proteins.

[0037] The ability to select and combine desired components from a library of polyketides and postpolyketide
biosynthesis genes for generation of novel polyketides for study is appealing. The method(s) of the present invention
make possible and facilitate the cloning of novel polyketide synthases, since one can generate gene banks with
clones containing large inserts (especially when using the f-factor based vectors), which facilitates cloning of gene
clusters.

[0038] Preferably, the gene cluster DNA is ligated into a vector, particularly wherein a vector further comprises
expression regulatory sequences which can control and regulate the production of a detectable protein or protein-related
array activity from the ligated gene clusters. Vectors which have an exceptionally large capacity for exogenous
DNA introduction are particularly appropriate for use with such gene clusters and are described by way of example
herein to include the f-factor (or fertility factor) of E. coli. This f-factor of E. coli is a plasmid which effects high-frequency
transfer of itself during conjugation and is ideal to achieve and stably propagate large DNA fragments, such as gene
clusters from mixed microbial samples.

LIBRARY SCREENING: 

[0039] After normalized libraries have been generated, unique enzymatic activities can be discovered using a variety
of solid- or liquid-phase screening assays in a variety of formats, including a high-throughput robotic format described 
herein. The normalization of the DNA used to construct the libraries is a key component in the process. Normalization 
will increase the representation of DNA from important organisms, including those represented in minor amounts in 
the sample. 

[0040] The following items also illustrate the invention:

1. A process for producing a normalized genomic DNA library from an environmental sample, which comprises
the steps of:

(a) isolating a genomic DNA population from the environmental sample;

(b) analyzing the complexity of the genomic DNA population so isolated;

(c) at least one of the steps selected from the group consisting of (i) amplifying the copy number of the DNA
population so isolated and (ii) recovering a fraction of the isolated genomic DNA having a desired characteristic;
and

(d) normalizing the representation of various DNAs within the genomic DNA population so as to form a
normalized library of genomic DNA from the environmental sample.

2. The process of item 1 which comprises the step of recovering a fraction of the isolated genomic DNA having a 
desired characteristic. 

3. The process of item 1 which comprises the step of amplifying the copy number of the DNA population so isolated.

4. The process of item 1 wherein the step of amplifying the genomic DNA precedes the normalizing step.

5. The process of item 1 wherein the step of normalizing the genomic DNA precedes the amplifying step.

6. The process of item 1 which comprises both the steps of (i) amplifying the copy number of the DNA population
so isolated and (ii) recovering a fraction of the isolated genomic DNA having a desired characteristic.

7. A normalized genomic DNA library formed from an environmental sample by a process comprising the steps of:

(a) isolating a genomic DNA population from the environmental sample; 

(b) analyzing the complexity of the genomic DNA population so isolated; 

(c) at least one of (i) amplifying the copy number of the DNA population so isolated and (ii) recovering a fraction 
of the isolated genomic DNA having a desired characteristic; and

(d) normalizing the representation of various DNAs within the genomic DNA population so as to form a nor- 
malized library of genomic DNA from the environmental sample. 
8. The library of item 1 wherein the process of forming said library comprises the step of recovering a fraction of
the isolated genomic DNA having a desired characteristic.

9. The library of item 1 wherein the process of forming said library comprises the step of amplifying the copy number
of the DNA population so isolated.

10. The library of item 1 wherein in the process of forming said library the step of amplifying the genomic DNA
precedes the normalizing step.

11. The library of item 1 wherein in the process of forming said library the step of normalizing the genomic DNA
precedes the amplifying step.

12. The library of item 1 wherein the process of forming said library comprises both the steps of (i) amplifying the
copy number of the DNA population so isolated and (ii) recovering a fraction of the isolated genomic DNA having
a desired characteristic.

13. A process for forming a normalized library of genomic gene clusters from an environmental sample which 
comprises 

(a) isolating a genomic DNA population from the environmental sample;

(b) analyzing the complexity of the genomic DNA population so isolated;

(c) at least one of (i) amplifying the copy number of the DNA population so isolated and (ii) recovering a fraction
of the isolated genomic DNA having a desired characteristic; and

(d) normalizing the representation of various DNAs within the genomic DNA population so as to form a
normalized library of genomic DNA from the environmental sample.

14. A normalized library of genomic gene clusters formed from an environmental sample by a process comprising 
the steps of 

(a) isolating a genomic DNA population from the environmental sample;

(b) analyzing the complexity of the genomic DNA population so isolated;

(c) at least one of (i) amplifying the copy number of the DNA population so isolated and (ii) recovering a fraction
of the isolated genomic DNA having a desired characteristic; and

(d) normalizing the representation of various DNAs within the genomic DNA population so as to form a
normalized library of genomic DNA from the environmental sample.

Example 1 

DNA Isolation 

[0041]

1. Samples are resuspended directly in the following buffer:

500 mM Tris-HCl, pH 8.0
100 mM NaCl
1 mM sodium citrate
100 µg/ml polyadenosine
5 mg/ml lysozyme

2. Incubate at 37°C for 1 hour with occasional agitation.

3. Digest with 2 mg/ml Proteinase K enzyme (Boehringer Mannheim) at 37°C for 30 min.

4. Add 8 ml of lysis buffer [200 mM Tris-HCl, pH 8.0/100 mM NaCl/4% (wt/vol) SDS/10% (wt/vol) 4-aminosalicylate]
and mix gently by inversion.

5. Perform three cycles of freezing in a dry ice-ethanol bath and thawing in a 65°C water bath to release nucleic
acids.

6. Extract the mixture with phenol and then phenol/chloroform/isoamyl alcohol.

7. Add 4 grams of acid-washed polyvinylpolypyrrolidone (PVPP) to the aqueous phase and incubate 30 minutes
at 37°C to remove organic contamination.

8. Pellet PVPP and filter the supernatant through a 0.45 µm membrane to remove residual PVPP.

9. Precipitate nucleic acids with isopropyl alcohol.

10. Resuspend pellet in 500 µl TE (10 mM Tris-HCl, pH 8.0/1.0 mM EDTA).

11. Add 0.1 g of ammonium acetate and centrifuge mixture at 4°C for 30 minutes.

12. Precipitate nucleic acids with isopropanol.

Example 2 

Bis-Benzimide Separation of DNA 

[0042] A sample composed of genomic DNA from Clostridium perfringens (27% G+C), Escherichia coli (49% G+C)
and Micrococcus lysodeikticus (72% G+C) was purified on a cesium-chloride gradient. The cesium chloride (Rf = 1.3980)
solution was filtered through a 0.2 µm filter and 15 ml were loaded into a 35 ml OptiSeal tube (Beckman). The DNA
was added and thoroughly mixed. Ten micrograms of bis-benzimide (Sigma; Hoechst 33258) were added and mixed
thoroughly. The tube was then filled with the filtered cesium chloride solution and spun in a VTi50 rotor in a Beckman
L8-70 Ultracentrifuge at 33,000 rpm for 72 hours. Following centrifugation, a syringe pump and fractionator (Brandel
Model 186) were used to drive the gradient through an ISCO UA-5 UV absorbance detector set to 280 nm. Three peaks
representing the DNA from the three organisms were obtained. PCR amplification of DNA encoding rRNA from a 10-fold
dilution of the E. coli peak was performed with the following primers to amplify eubacterial sequences:



Forward primer (27F):

5'-AGAGTTTGATCCTGGCTCAG-3'

Reverse primer (1492R):

5'-GGTTACCTTGTTACGACTT-3'
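
When planning an amplification with primers such as these, a rough melting-temperature estimate is often a useful first check. The sketch below applies the Wallace rule, Tm = 2 x (A+T) + 4 x (G+C) in degrees Celsius, which is a coarse rule of thumb intended for short oligonucleotides. It is offered only as an illustration: this description does not state annealing temperatures, and the function name is hypothetical.

```python
def wallace_tm(primer):
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C).

    Only a coarse approximation for short oligonucleotides; it ignores
    salt concentration, primer concentration and nearest-neighbor effects.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    if at + gc != len(primer):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


FORWARD_27F = "AGAGTTTGATCCTGGCTCAG"
REVERSE_1492R = "GGTTACCTTGTTACGACTT"

for label, seq in (("27F", FORWARD_27F), ("1492R", REVERSE_1492R)):
    print(f"{label}: length {len(seq)}, estimated Tm {wallace_tm(seq)} C")
```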

Example 3 

Sample of DNA obtained from the gill tissue of a clam harboring an endosymbiont which cannot be physically
separated from its host

[0043] 

1. Purify DNA on cesium chloride gradient according to published protocols (Sambrook, 1989). 

2. Prepare second cesium chloride solution (Rf = 1.3980); filter through 0.2 µm filter and load 15 ml into a 35 ml
OptiSeal tube (Beckman).

3. Add 10 µg bis-benzimide (Sigma; Hoechst 33258) and mix.

4. Add 50 µg purified DNA and mix thoroughly.

5. Spin in a VTi50 rotor in a Beckman L8-70 Ultracentrifuge at 33,000 rpm for 72 hours.

6. Use syringe pump and fractionator (Brandel Model 186) to drive gradient through an ISCO UA-5 UV absorbance
detector set to 280 nm.

Example 4 

Complexity Analysis 

[0044] 

1. 16S rRNA analysis is used to analyze the complexity of the DNA recovered from environmental samples
(Reysenbach, 1992; DeLong, 1992; Barns, 1994) according to the protocol outlined in Example 1.

2. Eubacterial sequences are amplified using the following primers:

Forward:

5'-AGAGTTTGATCCTGGCTCAG-3'

Reverse:

5'-GGTTACCTTGTTACGACTT-3'

Archaeal sequences are amplified using the following primers:

Forward:

5'-GCGGATCCGCGGGCGCTGCACAYCTGGTYGATYCTGCC-3'

Reverse:

5'-GACGGGCGGTGTGTRCA-3' (R = purine; Y = pyrimidine)

3. Amplification reactions proceed as published. The reaction buffer used in the amplification of the archaeal se- 
quences includes 5% acetamide (Bams, 1994), 

4. The products of the amplification reactions are rendered blunt ended by incubation with Pfu DNA polymerase. 

5. Blunt end ligation into the pCR-Script plasmid in the presence of Srfi restriction endonuclease according to the 
manufacturer's protocol (Strategene Cfoning Systems). 

6. Samples are sequenced using standard sequencing protocols (reference) and the number of different sequences 
present in the sample is determined. 
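
The degenerate positions in the archaeal primers (R = purine, Y = pyrimidine) mean that each primer is really a small pool of oligonucleotides. The sketch below expands such a primer into its concrete variants and, separately, counts the distinct sequences in a set of reads as a crude stand-in for the complexity estimate of step 6. The function names and the example reads are hypothetical, and counting exact-match sequences is a simplifying assumption; the source does not specify how "different" sequences are scored.

```python
from itertools import product

# IUPAC codes limited to those used by the primers above.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}


def expand_degenerate(primer: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


def count_distinct(reads: list[str]) -> int:
    """Number of distinct sequences, ignoring case (exact-match assumption)."""
    return len({read.upper() for read in reads})


ARCHAEAL_REVERSE = "GACGGGCGGTGTGTRCA"
variants = expand_degenerate(ARCHAEAL_REVERSE)
print(variants)

# Hypothetical clone reads, for illustration only.
print(count_distinct(["acgt", "ACGT", "ACGA"]))
```

A primer with three Y positions, such as the archaeal forward primer, expands to 2 x 2 x 2 = 8 variants under this scheme.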

Example 5 

Normalization 

[0045] Purified DNA is fractionated according to the bis-benzimide protocol of Example (2), and recovered DNA is
sheared or enzymatically digested to 3-6 kb fragments. Lone-linker primers are ligated and the DNA is size selected.
Size-selected DNA is amplified by PCR, if necessary.
[0046] Normalization is then accomplished as follows:

1. Double-stranded DNA sample is resuspended in hybridization buffer (0.12 M NaH2PO4, pH 6.8/0.82 M NaCl/1
mM EDTA/0.1% SDS).

2. Sample is overlaid with mineral oil and denatured by boiling for 10 minutes.

3. Sample is incubated at 68°C for 12-36 hours.

4. Double-stranded DNA is separated from single-stranded DNA according to standard protocols (Sambrook, 1989)
on hydroxyapatite at 60°C.

5. The single-stranded DNA fraction is desalted and amplified by PCR.

6. The process is repeated for several more rounds (up to 5 or more).
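
The reason the single-stranded fraction ends up normalized can be sketched with ideal second-order reassociation kinetics, in which the amount of a species still single-stranded after time t is C0 / (1 + k·C0·t). Abundant species reanneal faster, so after a long incubation the remaining single-stranded amounts converge toward 1/(k·t) regardless of starting abundance. The rate constant, time units, starting abundances, and the assumption that PCR re-amplifies without bias are all illustrative choices made here, not values from the source.

```python
# Toy model of one normalization round (steps 2-5 above).
# Assumes ideal second-order reassociation and unbiased PCR; k and t are
# in arbitrary, hypothetical units.

def single_stranded_remaining(c0: float, k: float, t: float) -> float:
    """Amount of a species still single-stranded after time t."""
    return c0 / (1.0 + k * c0 * t)


def normalize_round(abundances: dict[str, float], k: float, t: float) -> dict[str, float]:
    """Keep the single-stranded fraction and rescale to relative abundances."""
    remaining = {
        name: single_stranded_remaining(c0, k, t) for name, c0 in abundances.items()
    }
    total = sum(remaining.values())
    return {name: amount / total for name, amount in remaining.items()}


start = {"abundant": 0.90, "moderate": 0.09, "rare": 0.01}
after = normalize_round(start, k=1.0, t=1000.0)

for name in start:
    print(name, round(start[name], 3), "->", round(after[name], 3))
```

In this toy model the 90-fold difference between the most and least abundant species shrinks to roughly 1.1-fold after a single round; repeating the round, as in step 6, pushes the ratio closer to 1.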
Example 6

Library Construction

[0047]

1. Genomic DNA dissolved in TE buffer is vigorously passed through a 25 gauge double-hubbed needle until the
sheared fragments are in the desired size range.

2. DNA ends are "polished" or blunted with Mung Bean nuclease.

3. EcoRI restriction sites in the target DNA are protected with EcoRI methylase.

4. EcoRI linkers [GGAATTCC] are ligated to the blunted/protected DNA using a very high molar ratio of linkers to
target DNA.

5. Linkers are cut back with EcoRI restriction endonuclease and the DNA is size fractionated using sucrose
gradients.

6. Target DNA is ligated to the λZAPII vector, packaged using in vitro lambda packaging extracts, and grown in the
appropriate E. coli XL1 Blue host cell.

Example 7 

Library Screening

[0048] The following is a representative example of a procedure for screening an expression library prepared in
accordance with Example 6.

[0049] The general procedures for testing for various chemical characteristics are generally applicable to substrates
other than those specifically referred to in this Example.

[0050] Screening for Activity. Plates of the library prepared as described in Example 6 are used to multiply inoculate
a single plate containing 200 µL of LB Amp/Meth, glycerol in each well. This step is performed using the High Density
Replicating Tool (HDRT) of the Beckman Biomek with a 1% bleach, water, isopropanol, air-dry sterilization cycle
between each inoculation. The single plate is grown for 2 h at 37°C and is then used to inoculate two white 96-well
Dynatech microtiter daughter plates containing 250 µL of LB Amp/Meth, glycerol in each well. The original single plate
is incubated at 37°C for 18 h, then stored at -80°C. The two condensed daughter plates are incubated at 37°C also for
18 h. The condensed daughter plates are then heated at 70°C for 45 min. to kill the cells and inactivate the host E. coli
enzymes. A stock solution of 5 mg/mL morphourea phenylalanyl-7-amino-4-trifluoromethyl coumarin (MuPheAFC, the
'substrate') in DMSO is diluted to 600 µM with 50 mM pH 7.5 Hepes buffer containing 0.6 mg/mL of the detergent
dodecyl maltoside.

MuPheAFC 

[0051] Fifty µL of the 600 µM MuPheAFC solution is added to each of the wells of the white condensed plates with
one 100 µL mix cycle using the Biomek to yield a final concentration of substrate of ~100 µM. The fluorescence values
are recorded (excitation = 400 nm, emission = 505 nm) on a plate reading fluorometer immediately after addition of
the substrate (t=0). The plate is incubated at 70°C for 100 min, then allowed to cool to ambient temperature for 15
additional minutes. The fluorescence values are recorded again (t=100). The values at t=0 are subtracted from the
values at t=100 to determine if an active clone is present.
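
The dilution and the background subtraction described above reduce to two small calculations, sketched below. Adding 50 µL of 600 µM substrate to a 250 µL culture gives 300 µL at about 100 µM, which matches the stated final concentration. The hit threshold and the example readings are hypothetical; the source only says that the t=0 values are subtracted from the t=100 values.

```python
# Illustrative helpers for the plate screen; names, threshold, and readings
# are assumptions, not part of the described procedure.

def final_concentration(stock_um: float, added_ul: float, well_ul: float) -> float:
    """Concentration (µM) after adding `added_ul` of stock to `well_ul` of culture."""
    return stock_um * added_ul / (added_ul + well_ul)


def call_hits(t0: dict[str, float], t100: dict[str, float], threshold: float) -> list[str]:
    """Wells whose fluorescence increase (t=100 minus t=0) exceeds the threshold."""
    return [well for well in t0 if t100[well] - t0[well] > threshold]


print(final_concentration(stock_um=600.0, added_ul=50.0, well_ul=250.0))

# Hypothetical fluorescence readings in arbitrary units.
t0 = {"A1": 120.0, "A2": 118.0}
t100 = {"A1": 125.0, "A2": 940.0}
print(call_hits(t0, t100, threshold=200.0))
```

In practice the threshold would be set relative to negative-control wells on the same plate.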
[0052] The data will indicate whether one of the clones in a particular well is hydrolyzing the substrate. In order to
determine the individual clone which carries the activity, the source library plates are thawed and the individual clones
are used to singly inoculate a new plate containing LB Amp/Meth, glycerol. As above, the plate is incubated at 37°C
to grow the cells, heated at 70°C to inactivate the host enzymes, and 50 µL of 600 µM MuPheAFC is added using the
Biomek. Additionally three other substrates are tested. They are methyl umbelliferone heptanoate, the CBZ-arginine
rhodamine derivative, and fluorescein-conjugated casein (~3.2 mol fluorescein per mol of casein).

[0053] The umbelliferone and rhodamine are added as 600 µM stock solutions in 50 µL of Hepes buffer. The
fluorescein-conjugated casein is also added in 50 µL at a stock concentration of 20 and 200 mg/mL. After addition of the
substrates the t=0 fluorescence values are recorded, the plate is incubated at 70°C, and the t=100 min. values are
recorded as above.

[0054] These data indicate which plate the active clone is in, where the arginine rhodamine derivative is also turned
over by this activity, but the lipase substrate, methyl umbelliferone heptanoate, and protein, fluorescein-conjugated
casein, do not function as substrates.

[0055] Chiral amino esters may be determined using at least the following substrates: For each substrate which is
turned over the enantioselectivity value, E, is determined according to the equation below:

$$E = \frac{\ln[1 - c(1 + ee_p)]}{\ln[1 - c(1 - ee_p)]}$$

where ee_p = the enantiomeric excess (ee) of the hydrolyzed product and c = the percent conversion of the reaction.
See Wong and Whitesides, Enzymes in Synthetic Organic Chemistry, 1994, Elsevier, Tarrytown, New York, pp. 9-12.
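
A small worked example may help when applying the E equation. The sketch below assumes that both c and ee_p are entered as fractions between 0 and 1 (so 40% conversion is 0.40), since the logarithms are only defined on that scale; the function name and the example values are hypothetical.

```python
import math


def enantioselectivity(c: float, ee_p: float) -> float:
    """E = ln[1 - c(1 + ee_p)] / ln[1 - c(1 - ee_p)], with c and ee_p as fractions."""
    if not 0.0 < c < 1.0:
        raise ValueError("c must be a fraction strictly between 0 and 1")
    if not 0.0 < ee_p < 1.0:
        raise ValueError("ee_p must be a fraction strictly between 0 and 1")
    numerator_arg = 1.0 - c * (1.0 + ee_p)
    if numerator_arg <= 0.0:
        raise ValueError("c(1 + ee_p) must be below 1 for the logarithm to exist")
    return math.log(numerator_arg) / math.log(1.0 - c * (1.0 - ee_p))


# Hypothetical measurement: 40% conversion, product ee of 90%.
print(round(enantioselectivity(c=0.40, ee_p=0.90), 1))
```

For these example inputs E comes out near 35; values of E close to 1 indicate essentially no discrimination between enantiomers.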
[0056] The enantiomeric excess is determined by either chiral high performance liquid chromatography (HPLC) or
chiral capillary electrophoresis (CE). Assays are performed as follows: two hundred µL of the appropriate buffer is
added to each well of a 96-well white microtiter plate, followed by 50 µL of partially or completely purified enzyme
solution; 50 µL of substrate is added and the increase in fluorescence monitored versus time until 50% of the substrate
is consumed or the reaction stops, whichever comes first.

Example 8

Construction of a Stable, Large Insert Picoplankton Genomic DNA Library

[0057] Cell collection and preparation of DNA. Agarose plugs containing concentrated picoplankton cells were
prepared from samples collected on an oceanographic cruise from Newport, Oregon to Honolulu, Hawaii. Seawater
(30 liters) was collected in Niskin bottles, screened through 10 µm Nitex, and concentrated by hollow fiber filtration
(Amicon DC10) through 30,000 MW cutoff polysulfone filters. The concentrated bacterioplankton cells were collected
on a 0.22 µm, 47 mm Durapore filter, and resuspended in 1 ml of 2X STE buffer (1 M NaCl, 0.1 M EDTA, 10 mM Tris,
pH 8.0) to a final density of approximately 1 x 10^10 cells per ml. The cell suspension was mixed with one volume of
1% molten Seaplaque LMP agarose (FMC) cooled to 40°C, and then immediately drawn into a 1 ml syringe. The syringe
was sealed with parafilm and placed on ice for 10 min. The cell-containing agarose plug was extruded into 10 ml of
Lysis Buffer (10 mM Tris pH 8.0, 50 mM NaCl, 0.1 M EDTA, 1% Sarkosyl, 0.2% sodium deoxycholate, 1 mg/ml lysozyme)
and incubated at 37°C for one hour. The agarose plug was then transferred to 40 ml of ESP Buffer (1% Sarkosyl, 1
mg/ml proteinase K, in 0.5 M EDTA), and incubated at 55°C for 16 hours. The solution was decanted and replaced with
fresh ESP Buffer, and incubated at 55°C for an additional hour. The agarose plugs were then placed in 50 mM EDTA
and stored at 4°C shipboard for the duration of the oceanographic cruise.

[0058] One slice of an agarose plug (72 µl) prepared from a sample collected off the Oregon coast was dialyzed
overnight at 4°C against 1 ml of buffer A (100 mM NaCl, 10 mM Bis Tris Propane-HCl, 100 µg/ml acetylated BSA: pH
7.0 @ 25°C) in a 2 ml microcentrifuge tube. The solution was replaced with 250 µl of fresh buffer A containing 10 mM
MgCl2 and 1 mM DTT and incubated on a rocking platform for 1 hr at room temperature. The solution was then changed
to 250 µl of the same buffer containing 4 U of Sau3A1 (NEB), equilibrated to 37°C in a water bath, and then incubated
on a rocking platform in a 37°C incubator for 45 min. The plug was transferred to a 1.5 ml microcentrifuge tube and
incubated at 68°C for 30 min to inactivate the enzyme and to melt the agarose. The agarose was digested and the
DNA dephosphorylated using Gelase and HK-phosphatase (Epicentre), respectively, according to the manufacturer's
recommendations. Protein was removed by gentle phenol/chloroform extraction and the DNA was ethanol precipitated,
pelleted, and then washed with 70% ethanol. This partially digested DNA was resuspended in sterile H2O to a
concentration of 2.5 ng/µl for ligation to the pFOS1 vector.

[0059] PCR amplification results from several of the agarose plugs (data not shown) indicated the presence of
significant amounts of archaeal DNA. Quantitative hybridization experiments using rRNA extracted from one sample,
collected at 200 m of depth off the Oregon Coast, indicated that planktonic archaea in this assemblage comprised
approximately 4.7% of the total picoplankton biomass (this sample corresponds to "PAC1"-200 m in Table 1 of DeLong
et al., High abundance of Archaea in Antarctic marine picoplankton, Nature, 371:695-698, 1994). Results from
archaeal-biased rDNA PCR amplification performed on agarose plug lysates confirmed the presence of relatively large amounts
of archaeal DNA in this sample. Agarose plugs prepared from this picoplankton sample were chosen for subsequent
fosmid library preparation. Each 1 ml agarose plug from this site contained approximately 7.5 x 10^5 cells, therefore
approximately 5.4 x 10^5 cells were present in the 72 µl slice used in the preparation of the partially digested DNA.
[0060] Vector arms were prepared from pFOS1 as described (Kim et al., Stable propagation of cosmid sized human
DNA inserts in an F factor based vector, Nucl. Acids Res., 20:10832-10835, 1992). Briefly, the plasmid was completely
digested with AstII, dephosphorylated with HK phosphatase, and then digested with BamHI to generate two arms,
each of which contained a cos site in the proper orientation for cloning and packaging ligated DNA between 35-45 kbp.
The partially digested picoplankton DNA was ligated overnight to the pFOS1 arms in a 15 µl ligation reaction containing
25 ng each of vector and insert and 1 U of T4 DNA ligase (Boehringer-Mannheim). The ligated DNA in four microliters
of this reaction was in vitro packaged using the Gigapack XL packaging system (Stratagene), the fosmid particles
transfected to E. coli strain DH10B (BRL), and the cells spread onto LB cm15 plates. The resultant fosmid clones were
picked into 96-well microtiter dishes containing LB cm15 supplemented with 7% glycerol. Recombinant fosmids, each
containing ca. 40 kb of picoplankton DNA insert, yielded a library of 3,552 fosmid clones, containing approximately 1.4
x 10^8 base pairs of cloned DNA. All of the clones examined contained inserts ranging from 38 to 42 kbp. This library
was stored frozen at -80°C for later analysis.
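
The library figures quoted above can be cross-checked with simple arithmetic: 3,552 clones at roughly 40 kb each give about 1.4 x 10^8 bp, and the same clone count fills 37 full 96-well dishes. The sketch below uses hypothetical function names and takes the nominal 40 kb insert size as the mean, which is an approximation given the stated 38 to 42 kbp range.

```python
import math


def library_size_bp(n_clones: int, mean_insert_bp: int) -> int:
    """Total cloned DNA, assuming every clone carries the mean insert size."""
    return n_clones * mean_insert_bp


def plates_needed(n_clones: int, wells_per_plate: int = 96) -> int:
    """Number of microtiter dishes required to hold one clone per well."""
    return math.ceil(n_clones / wells_per_plate)


total_bp = library_size_bp(n_clones=3552, mean_insert_bp=40_000)
plates = plates_needed(n_clones=3552)

print(f"{total_bp:.2e} bp in {plates} plates")
```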

[0061] Numerous modifications and variations of the present invention are possible in light of the above teachings; 
therefore, within the scope of the claims, the invention may be practiced other than as particularly described. 
TABLE 1

A2

Fluorescein conjugated casein (3.2 mol fluorescein/mol casein)
CBZ-Ala-AMC
t-BOC-Ala-Ala-Asp-AMC
succinyl-Ala-Gly-Leu-AMC
CBZ-Arg-AMC
CBZ-Met-AMC
morphourea-Phe-AMC

t-BOC = t-butoxy carbonyl, CBZ = carbonyl benzyloxy,
AMC = 7-amino-4-methyl coumarin


AA3, AB3, AC3

[Chemical structure drawings not reproduced.]

AD3

Fluorescein conjugated casein
t-BOC-Ala-Ala-Asp-AFC
CBZ-Ala-Ala-Lys-AFC
succinyl-Ala-Ala-Phe-AFC
succinyl-Ala-Gly-Leu-AFC

AFC = 7-amino-4-trifluoromethyl coumarin

AE3

Fluorescein conjugated casein

AF3

t-BOC-Ala-Ala-Asp-AFC
CBZ-Asp-AFC

AG3

CBZ-Ala-Ala-Lys-AFC
CBZ-Arg-AFC

AH3

succinyl-Ala-Ala-Phe-AFC
CBZ-Phe-AFC
CBZ-Trp-AFC

AI3

succinyl-Ala-Gly-Leu-AFC
CBZ-Ala-AFC
CBZ-Ser-AFC
EP 1 528 067 A2

TABLE 3

[Chemical structure drawings not reproduced. Recoverable compound labels: LA3, LB3, LD3, LE3, LF3, LG3, LK3, LL3,
LN3, LO3. The annotation "And all of L2" appears twice.]
TABLE 4

4-methyl umbelliferone
wherein R =

β-D-galactose
β-D-glucose
β-D-glucuronide
β-D-cellotrioside
β-D-cellobiopyranoside

β-D-galactose
α-D-galactose
β-D-glucose
α-D-glucose
β-D-glucuronide
β-D-N,N-diacetylchitobiose
β-D-fucose
α-L-fucose
β-L-fucose
β-D-mannose
α-D-mannose

non-Umbelliferyl substrates

amylose [polyglucan α1,4 linkages], amylopectin [polyglucan branching α1,6 linkages]
xylan [poly 1,4-D-xylan]
amylopectin, pullulan
sucrose, fructofuranoside

SEQUENCE LISTING 

(1) GENERAL INFORMATION

(i) APPLICANT: Recombinant Biocatalysis, Inc.

(ii) TITLE OF INVENTION: PRODUCTION AND USE OF NORMALIZED DNA LIBRARIES

(iii) NUMBER OF SEQUENCES: 10

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: FISH & RICHARDSON
(B) STREET: 4225 EXECUTIVE SQUARE, STE. 1400
(C) CITY: LA JOLLA
(D) STATE: CA
(E) COUNTRY: USA
(F) ZIP: 92037

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: 3.5 INCH DISKETTE
(B) COMPUTER: IBM PS/2
(C) OPERATING SYSTEM: MS-DOS
(D) SOFTWARE: WORD PERFECT 6.0

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: Unassigned
(B) FILING DATE: 18 June 1997
(C) CLASSIFICATION: Unassigned

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/665,565
(B) FILING DATE: 18 June 1996
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: LISA A. HAILE, Ph.D.
(B) REGISTRATION NUMBER: 38,347
(C) REFERENCE/DOCKET NUMBER: 09010/019WO1

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 619-678-5070
(B) TELEFAX: 619-678-5099

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 52 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
CCGAGAATTC ATTAAAGAGG AGAAATTAAC TATGATTGAA GACCCTATGG AC

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 31 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
CGGAAGATCT TTAAGCACTT CTCTCAGGTT C

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 52 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
CCGAGAATTC ATTAAAGAGG AGAAATTAAC TATGGACAGG CTTGAAAAAG TA

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 31 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
CGGAAGATCT TCAGCTAAGC TTCTCTAAGA A

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 52 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
CCGACAATTG ATTAAAGAGG AGAAATTAAC TATGTGGGAA TTAGACCCTA AA

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 31 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
CGGAGGATCC CTACACCTGT TTTTCAAGCT C

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 52 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
CCGACAATTG ATTAAAGAGG AGAAATTAAC TATGACATAC TTAATGAACA AT

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 31 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
CGGAAGATCT TTATGAGAAG TCCCXTTCAA G

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 52 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
CCGAGAATTC ATTAAAGAGG AGAAATTAAC TATGCGGAAA CTGGCCGAGC GG

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 31 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
CGGAGGATCC TTAAAGTGCC GCTTCGATCA A



Claims 



1. A method for forming a normalized DNA library from a mixed population of organisms, which comprises: 

(a) obtaining a DNA population from the mixed population of organisms; 

(b) at least one of the steps selected from the group consisting of (i) amplifying the copy number of the DNA 
population so isolated and (ii) recovering a fraction of the isolated DNA having a desired characteristic; and 

(c) normalizing the representation of various DNAs within the DNA population so as to form a normalized 
library of DNA from the mixed population of organisms. 

2. The method of claim 1, further comprising prior to (b) fractionating the DNA population by contacting the DNA
population with an intercalating agent and separating the DNA.

3. The method of claim 2, wherein the intercalating agent is bis-benzimide.

4. The method of claim 1, which comprises recovering a fraction of the isolated DNA having a desired characteristic.

5. The method of claim 1, which comprises amplifying the copy number of the DNA population so isolated.

6. The method of claim 1, wherein the amplifying the DNA precedes normalizing. 

7. The method of claim 1, wherein normalizing the DNA precedes amplifying. 

8. The method of claim 1, which comprises both the steps of (i) amplifying the copy number of the DNA population 
so isolated and (ii) recovering a fraction of the isolated DNA having a desired characteristic. 

9. A normalized DNA library formed from a mixed population of organisms by a method comprising:

(a) obtaining a DNA population from the mixed population of organisms;

(b) at least one of (i) amplifying the copy number of the DNA population so isolated and (ii) recovering a fraction
of the isolated DNA having a desired characteristic;

(c) normalizing the representation of various DNAs within the DNA population; and

(d) transforming host cells with the DNA of (c) so as to form a normalized library of DNA from the mixed
population of organisms.

10. The library of claim 9, further comprising prior to (b) fractionating the DNA population by contacting the DNA 
population with an intercalating agent and separating the DNA. 

11. The library of claim 10, wherein the intercalating agent is bis-benzimide.

12. The library of claim 9, wherein the method of forming said library comprises recovering a fraction of the isolated 
DNA having a desired characteristic. 

13. The library of claim 9, wherein the method of forming said library comprises amplifying the copy number of the 
DNA population so isolated. 

14. The library of claim 9, wherein in the method of forming said library amplifying the DNA precedes normalizing. 

15. The library of claim 9, wherein in the method of forming said library normalizing the DNA precedes amplifying. 

16. The library of claim 9, wherein the method of forming said library comprises both the steps of (i) amplifying the
copy number of the DNA population so isolated and (ii) recovering a fraction of the isolated DNA having a desired
characteristic.

17. A method for producing a normalized library of gene clusters from a mixed population of organisms, which
comprises:

(a) obtaining a DNA population from the mixed population of organisms; 

(b) at least one of (i) amplifying the copy number of the DNA population so isolated and (ii) recovering a fraction 
of the isolated DNA having a desired characteristic; and 

(c) normalizing the representation of various DNAs within the DNA population so as to produce a normalized 
library of DNA from the mixed population of organisms. 

18. The method of claim 17, further comprising prior to (b) fractionating the DNA population by contacting the DNA 
population with an intercalating agent and separating the DNA. 

19. A normalized library of gene clusters formed from a mixed population of organisms by a method comprising: 

(a) obtaining a DNA population from the mixed population of organisms; 

(b) at least one of (i) amplifying the copy number of the DNA population so isolated and (ii) recovering a fraction 
of the isolated DNA having a desired characteristic; and 

(c) normalizing the representation of various DNAs within the DNA population so as to form a normalized 
library of DNA from the mixed population of organisms. 